Method and apparatus for multiplying based on Booth's algorithm

ABSTRACT

A multiplying apparatus and method based on Booth's algorithm are disclosed. According to a multiplier index, one of several predetermined multiplier coefficient sets is chosen. Each multiplier coefficient set contains several multiplier coefficients that are generated from a predetermined multiplier value by Booth's algorithm. The multiplier coefficients are then used, together with a multiplicand, to generate partial products by Booth's algorithm. By summing all of the partial products, an output value is generated.

BACKGROUND OF THE PRESENT INVENTION

1. Field of the Invention

The invention relates to an apparatus and method for multiplying, and more particularly, to an apparatus and method for multiplying based on Booth's algorithm.

2. Description of the Prior Art

The discrete cosine transform (DCT) and the inverse discrete cosine transform (IDCT) are used for data compression and data decompression, respectively. One well-known DCT and IDCT technology is a Fast Fourier Transform (FFT) based on Lee's algorithm. FIG. 1A is a diagram of a shuttle exchange circuit based on Lee's algorithm for the DCT. The DCT is divided into a first stage computation, a second stage computation, a third stage computation and a fourth stage computation, so that eight parallel output values (Y0, Y1, . . . , Y7) can be evaluated by the four stage computations from eight parallel input values (X0, X1, . . . , X7). There are two function blocks in FIG. 1A: a DCT processor 1 and a post-processor 2. The DCT processor 1 is constructed from 12 similar process means 3 designed as butterfly circuits, and the post-processor 2, constructed from five adding means 4 and a fixed coefficient multiplication means 5, is connected after the DCT processor 1. Each process means 3 comprises an adder 31, a subtractor 32 and a fixed coefficient multiplier 5. Among the fixed coefficient multipliers 5 of all process means 3, four are labeled A, two are labeled B, two are labeled C, and one each is labeled D, E, F and G. The coefficients labeled A, B, C, D, E, F and G are $\frac{1}{2}\cos(\pi/4)$, $\frac{1}{2}\cos(\pi/8)$, $\frac{1}{2}\cos(3\pi/8)$, $\frac{1}{2}\cos(\pi/16)$, $\frac{1}{2}\cos(3\pi/16)$, $\frac{1}{2}\cos(7\pi/16)$ and $\frac{1}{2}\cos(5\pi/16)$, respectively. Apart from the adders, subtractors and multipliers, no control means is needed in FIG. 1A. Because the DCT data flow does not depend on any control means, it can be designed as a data-flow architecture.

Corresponding to FIG. 1A, FIG. 1B is a diagram of an IDCT circuit based on Lee's algorithm. The IDCT is divided into a first stage computation, a second stage computation, a third stage computation and a fourth stage computation, so that eight parallel output values (X0, X1, . . . , X7) can be evaluated by the four stage computations from eight parallel input values (Z0, Z1, . . . , Z7). There are two function blocks in FIG. 1B: an IDCT processor 7 and a pre-processor 6. The IDCT processor 7 is constructed from 12 similar process means 8 designed as butterfly circuits, and the pre-processor 6, constructed from five adding means 9 and a fixed coefficient multiplication means 10, is connected before the IDCT processor 7. Each process means 8 comprises an adder 81, a subtractor 82 and a fixed coefficient multiplier 10. Among the fixed coefficient multipliers 10 of all process means 8, four are labeled A, two are labeled B, two are labeled C, and one each is labeled D, E, F and G. The coefficients of the fixed coefficient multipliers 10 labeled A, B, C, D, E, F and G are the same as the corresponding coefficients in FIG. 1A. Another well-known DCT/IDCT algorithm is Chen's algorithm. Further details of Lee's algorithm and Chen's algorithm can be found in U.S. Pat. No. 5,452,466 and U.S. Pat. No. 5,841,682.

Generally speaking, a multiplier circuit costs more space and computing time than an adder; in particular, the hardware cost of implementing a multiplier circuit is much higher than that of implementing an adder. Therefore, most of the cost of the DCT and IDCT is spent on multiplication, and many improved multiplier circuits have been applied to the DCT and IDCT. One of the improved multiplier circuits is based on Booth's algorithm, details of which can be found in U.S. Pat. No. 5,485,413. In Booth's algorithm, referring to FIG. 2A, the multiplier is transformed in step 220 into a coefficient set comprising a plurality of multiplier coefficients. Step 240 then generates a plurality of partial products by multiplying a multiplicand with the coefficient set. Finally, in step 260, all partial products are added to generate the product. A multiplier circuit can be designed according to this multiplication method. Referring to FIG. 2B, a coefficient generation means 22 transforms the multiplier 211 into the coefficient set 221, which contains a plurality of multiplier coefficients 222. Next, the partial products generation means 24 generates a plurality of partial products 242 according to the multiplier coefficients 222 and a multiplicand 212. Finally, the partial products 242 are added by the summating means 26 to generate the product 262 of the multiplier 211 and the multiplicand 212. Because the number of multiplier coefficients is less than the number of bits of the multiplier, the number of partial products 242 is smaller as well. Therefore, both cost and performance can be much improved.
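The three steps of FIG. 2A can be sketched in software as follows. This is a minimal illustration only, assuming the common radix-4 (modified) form of Booth's algorithm and two's-complement integers, neither of which is specified above; the names `booth_digits`, `booth_multiply` and the `bits` parameter are hypothetical.

```python
def booth_digits(multiplier: int, bits: int) -> list[int]:
    """Step 220 (sketch): recode a signed multiplier into radix-4 Booth digits.

    Returns digits in {-2, -1, 0, 1, 2}, least significant first, such that
    multiplier == sum(d * 4**k for k, d in enumerate(digits)).
    """
    if not -(1 << (bits - 1)) <= multiplier < (1 << (bits - 1)):
        raise ValueError("multiplier does not fit in the given signed width")
    width = bits + (bits % 2)                   # pad to an even number of bits
    pattern = multiplier & ((1 << width) - 1)   # sign-extended two's-complement pattern
    digits, prev = [], 0                        # prev is the bit just below the current pair
    for i in range(0, width, 2):
        low = (pattern >> i) & 1
        high = (pattern >> (i + 1)) & 1
        digits.append(prev + low - 2 * high)
        prev = high
    return digits


def booth_multiply(multiplier: int, multiplicand: int, bits: int) -> int:
    """Multiply via Booth digits: recode, form partial products, sum them."""
    digits = booth_digits(multiplier, bits)                       # step 220
    # Step 240: each partial product is 0, +/-1 or +/-2 times the multiplicand,
    # shifted left by two bit positions per digit.
    partials = [(d * multiplicand) << (2 * k) for k, d in enumerate(digits)]
    return sum(partials)                                          # step 260
```

Under these assumptions a 16-bit multiplier is recoded into 8 digits, so at most 8 partial products are summed instead of 16.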

In the conventional technologies, there are seven similar multiplier circuits for the DCT and IDCT, and almost all of them involve many computation steps, so the computing cost is high. The fewer computations the multiplication requires, the lower the cost.

SUMMARY OF THE PRESENT INVENTION

Accordingly, the present invention provides a method and apparatus for multiplying based on Booth's algorithm to simplify the computation by decreasing multiplier coefficients.

The present invention provides a method for multiplying in which a multiplication coefficient set is chosen from a plurality of multiplication coefficient sets according to a multiplication index. Each multiplication coefficient set comprises a plurality of multiplication coefficients transformed from a determined multiplier. Multiplying the multiplication coefficients with a multiplicand then generates a plurality of partial products, and finally an output value is generated by summing all partial products.

The present invention also provides a multiplication apparatus, comprising: a coefficient generation means, in which one of a plurality of coefficient sets, each with a plurality of coefficients generated by Booth's algorithm, is chosen in accordance with a multiplier; a partial products generation means, in which partial products are generated by multiplying the chosen coefficient set with a multiplicand; and a summing means for generating an output value by summing all partial products.

BRIEF DESCRIPTION OF THE DRAWINGS

A better understanding of the present invention can be obtained when the following Detailed Description is considered in conjunction with the following drawings, in which:

FIG. 1A and FIG. 1B are the structure block diagrams of the prior art;

FIG. 2A and FIG. 2B are the function block diagrams based on Booth's algorithm of one embodiment of the present invention;

FIG. 3 is the flowchart diagram of another embodiment of the present invention; and

FIG. 4 is the functional block diagram of a further embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

The feature of Booth's algorithm is to replace the multiplier with a plurality of multiplier coefficients, which are multiplied with the multiplicand to generate partial products; the product is generated by summing all partial products. The present invention improves on this feature of Booth's algorithm under the specific condition that all possible multipliers are predetermined (that is, each possible multiplier is chosen from a fixed group). In other words, all possible multipliers are known in advance. Therefore, each possible multiplier can be replaced with a multiplier index corresponding to a multiplier coefficient set with a plurality of multiplier coefficients. Accordingly, the corresponding multiplier coefficient set can be indexed directly, and the cost is reduced because the transformation that the prior art performs to generate the multiplier coefficients after the multiplier is determined is no longer needed. Moreover, because all possible multipliers are determined, each of them can be identified by fewer bits. For example, if the number of possible multipliers is 8, each can be identified by 3 bits, even if a multiplier has 16 bits and thus 2¹⁶ possible values.

Moreover, the output may be a part of the product rather than the whole product. For instance, the output value may consist of the integer part and some of the fractional places, or only some of the integer bits of the product. For example, the product may be represented by 40 bits while the values of all possible products lie within a range of 23 bits; in this case, only 23 bits are needed for the output.

Furthermore, if the output is the above-mentioned part of the product, the bits from the most significant bit down to the lowest output bit are assigned as a high bit set, and the rest of the product is assigned as a low bit set. In the summing, the high bit sets and the low bit sets of all partial products are summed into a high bit product and a low bit product, respectively. The low bit product comprises a carry out value, represented by the bits beyond those of the low bit set, which is summed with the high bit product. The sum of the high bit product and the carry out value can be the output. Accordingly, apart from the carry out value, the bits of the low bit product need not be kept, which reduces the cost.
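The split summation described above can be illustrated with a short sketch. It assumes two's-complement partial products and an output that simply drops the lowest bits of the product; the function name `sum_high_with_carry` and the parameter `low_bits` are hypothetical and chosen only for this illustration.

```python
def sum_high_with_carry(partials: list[int], low_bits: int) -> int:
    """Return the product with its lowest `low_bits` bits dropped,
    computed from separately summed high and low bit sets."""
    mask = (1 << low_bits) - 1
    # High bit set of each partial product; >> on Python ints is an
    # arithmetic shift, so the sign of a negative partial product is kept.
    high_product = sum(p >> low_bits for p in partials)
    # Low bit set of each partial product, always in [0, 2**low_bits).
    low_product = sum(p & mask for p in partials)
    # Bits of the low bit product beyond the low bit set form the carry out value.
    carry_out = low_product >> low_bits
    # Only the carry out value of the low bit product is needed for the output.
    return high_product + carry_out
```

Because each partial product equals its high bit set times `2**low_bits` plus its low bit set, the returned value matches `sum(partials) >> low_bits`, so the remaining bits of the low bit product never have to be stored.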

The present invention can be used for integers, floating point values, fixed point values or other value types. The values may be represented in binary, nibble, decimal, hexadecimal and so on; neither the type of the values nor the way they are represented is limited in the present invention.

In addition, a multiplier coefficient set can be chosen by means of a lookup table. The lookup table records the correspondence between the multiplier indexes and the multiplier coefficient sets and can be implemented in memories, state-latched circuits or other storage media. The multiplier index can be used as the address or as the control signals that select the multiplier coefficients of the corresponding multiplier coefficient set, all of which can be integrated in a logic circuit. The lookup table is described only for clarity and does not confine the implementation of the present invention; the present invention does not limit how the multiplier coefficient set is chosen by the multiplier index.

Accordingly, referring to FIG. 3, one embodiment of the present invention is a method for multiplying based on Booth's algorithm. First, step 320 chooses the corresponding one of the determined multiplier coefficient sets according to a multiplier index. Each of the multiplier coefficient sets comprises a plurality of multiplier coefficients according to Booth's algorithm. Namely, all possible values of the multiplier are determined, and each possible value of the multiplier corresponds to a set of multiplier coefficients, transformed according to Booth's algorithm and indexed by a respective multiplier index. That is, each multiplier index corresponds to a set of multiplier coefficients. In addition, the values corresponding to different multiplier indexes may be the same.

Next, step 340 generates a plurality of partial products by multiplying the multiplier coefficients with a multiplicand according to Booth's algorithm.

Finally, step 360 sums all partial products to generate an output value. The output value can be the whole product of the multiplication or the above-mentioned part of the product. The other details of the present invention are described above and are not repeated here.
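Steps 320, 340 and 360 can be sketched as follows. This is only an illustration under stated assumptions: radix-4 Booth recoding, a 16-bit signed multiplier width, a Python list standing in for the lookup table, and an arbitrary set of eight multiplier values. The names `COEFFICIENT_SETS`, `multiply_by_index` and the values in `MULTIPLIERS` are hypothetical.

```python
def booth_digits(value: int, bits: int) -> list[int]:
    """Radix-4 Booth digits of a signed value, least significant first."""
    pattern = value & ((1 << bits) - 1)   # two's-complement pattern; bits assumed even
    digits, prev = [], 0
    for i in range(0, bits, 2):
        low, high = (pattern >> i) & 1, (pattern >> (i + 1)) & 1
        digits.append(prev + low - 2 * high)
        prev = high
    return digits


BITS = 16
# Hypothetical group of predetermined multipliers (illustrative values only).
MULTIPLIERS = [23170, 30274, 12540, 32138, 27246, 6393, 18205, -23170]

# Built once, ahead of time: multiplier index -> multiplier coefficient set.
COEFFICIENT_SETS = [booth_digits(m, BITS) for m in MULTIPLIERS]

# Eight entries can be addressed with a 3-bit multiplier index.
INDEX_BITS = (len(COEFFICIENT_SETS) - 1).bit_length()


def multiply_by_index(index: int, multiplicand: int) -> int:
    coefficients = COEFFICIENT_SETS[index]                 # step 320: choose by index
    partials = [(d * multiplicand) << (2 * k)              # step 340: partial products
                for k, d in enumerate(coefficients)]
    return sum(partials)                                   # step 360: sum
```

In this sketch the recoding runs only while the table is built, so a call to `multiply_by_index` performs no transformation of the multiplier at all.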

Another embodiment of the present invention is an apparatus for multiplying based on Booth's algorithm which, referring to FIG. 4, comprises a coefficient generation means 42, a partial product generation means 24 and a summing means 46. The coefficient generation means 42 chooses one of a plurality of coefficient sets to be a multiplier coefficient set 221, wherein each coefficient set comprises a plurality of multiplier coefficients 222 transformed from a determined multiplier based on Booth's algorithm. Next, the partial product generation means 24 generates the partial products 242 from the multiplier coefficient set 221 and a multiplicand 212 based on Booth's algorithm. Finally, the summing means 46 sums all partial products 242 to generate an output value 463. As mentioned above, the output value can be generated from the product obtained by summing the high bit product 441 and the low bit product 442. The high bit product 441 and the low bit product 442 can be summed from the high bits 2421 and the low bits 2422 of the partial products 242, respectively, wherein the low bit product 442 comprises the foregoing carry out value 4421, and the output value 443 can further be generated from the high bit product 441 and the carry out value 4421. The other details of the embodiment are described above and are not repeated here. Notably, since the function of each of the coefficient generation means 42, the partial product generation means 24 and the summing means 46 is clear, anyone skilled in the art could implement them without difficulty. For example, the coefficient generation means 42 might be a hardware circuit such as a combination of a multiplexer and some additional units; the partial product generation means might be a hardware circuit that combines several multipliers; and the summing means might be a hardware circuit that combines several adders.

Accordingly, a further embodiment of the present invention is an apparatus for multiplying based on Booth's algorithm in DCT/IDCT, i.e. the fixed point multiplication means in Lee's algorithm. The fixed coefficients in the multiplications of Lee's algorithm are fixed point values, which can be cosine values or sine values, i.e. $\frac{1}{2}\cos(\pi/4)$, $\frac{1}{2}\cos(\pi/8)$, $\frac{1}{2}\cos(3\pi/8)$, $\frac{1}{2}\cos(\pi/16)$, $\frac{1}{2}\cos(3\pi/16)$, $\frac{1}{2}\cos(7\pi/16)$ and $\frac{1}{2}\cos(5\pi/16)$. Moreover, the embodiment can also be the fixed point multiplication means in Chen's algorithm. In addition, the foregoing DCT/IDCT can further be applied in digital multimedia apparatuses, e.g. VCD players, DVD players, HDTV and so forth. The other details of the embodiment are described above and are not repeated here.
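As an illustration of how the seven coefficients listed above could become a fixed group of predetermined fixed point multipliers, the sketch below quantizes them and addresses them by index. The number of fractional bits (`FRACTION_BITS = 14`), the index order A to G and the helper `scale` are assumptions made for this example only; the text does not specify a precision or an index assignment.

```python
import math

FRACTION_BITS = 14  # assumed fixed-point precision; not given in the text

# Coefficients A..G in the order listed above, as (label, angle) pairs.
ANGLES = [("A", math.pi / 4), ("B", math.pi / 8), ("C", 3 * math.pi / 8),
          ("D", math.pi / 16), ("E", 3 * math.pi / 16),
          ("F", 7 * math.pi / 16), ("G", 5 * math.pi / 16)]

# Predetermined fixed-point multipliers, one per multiplier index 0..6.
FIXED_COEFFICIENTS = [round(0.5 * math.cos(angle) * (1 << FRACTION_BITS))
                      for _, angle in ANGLES]

# Seven entries can be addressed with a 3-bit multiplier index.
INDEX_BITS = (len(FIXED_COEFFICIENTS) - 1).bit_length()


def scale(index: int, sample: int) -> int:
    """Multiply an integer sample by coefficient `index`, dropping the fractional bits."""
    return (FIXED_COEFFICIENTS[index] * sample) >> FRACTION_BITS
```

Each entry of `FIXED_COEFFICIENTS` is known before any multiplication takes place, which is the condition under which its Booth coefficient set can be prepared in advance and selected by index.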

What is described above is only the preferred embodiments of the invention and is not intended to confine the claims of the invention. Those familiar with the present technical field can understand the description above and put it into practice; therefore, any equivalent variations or modifications made within the spirit disclosed by the invention should be covered by the appended claims.

1. A method for multiplying based on Booth's algorithm, comprising: choosing a multiplier coefficient set according to a multiplier index, wherein said multiplier coefficient set comprises a plurality of multiplier coefficients transformed by Booth's algorithm according to a determined multiplier corresponding to said multiplier index; generating a plurality of partial products by multiplying said multiplier coefficients with a multiplicand while using Booth's algorithm; and summing said partial products to generate an output value.
 2. The method for multiplying based on Booth's algorithm according to claim 1, wherein said multiplier coefficients are indexed by said multiplier index in a lookup table, wherein said lookup table comprises the correspondent relations of a plurality of multiplier indexes and a plurality of multiplier coefficient sets.
 3. The method for multiplying based on Booth's algorithm according to claim 1, wherein the sum of said partial products is a product of the multiplication of said determined multiplier and said multiplicand.
 4. The method for multiplying based on Booth's algorithm according to claim 1, wherein said product is a set of binary bits and said output value is formed by partial bits of said product.
 5. The method for multiplying based on Booth's algorithm according to claim 1, wherein each of said partial products is a set of binary bits comprising a high bit set and a low bit set, wherein the sum of said high bit set of said partial products is a high bit product and said product is the sum of said high bit product and a carry out value that is the rest bits except said low bit set in the sum of said low bit set of said partial products.
 6. The method for multiplying based on Booth's algorithm according to claim 1, wherein said output value is a sum of said high bit product and said carry out value.
 7. The method for multiplying based on Booth's algorithm according to claim 1, wherein said multiplier index is chosen from the floating point value and the fixed point value, and is identified by the following representations: binary, nibble, decimal and hexadecimal.
 8. The method for multiplying based on Booth's algorithm according to claim 1, wherein said multiplicand is chosen from the floating point value and the fixed point value, and is identified by the following representations: binary, nibble, decimal and hexadecimal.
 9. An apparatus for multiplying based on Booth's algorithm, comprising: a coefficient generation means for choosing one of a plurality of coefficient sets to be a multiplier coefficient set comprising a plurality of multiplier coefficients transformed by Booth's algorithm according to a determined multiplier corresponding to said multiplier index; a partial product generation means for generating a plurality of partial products by multiplying said multiplier coefficients with a multiplicand; and a summing means for summing said partial products to generate an output value.
 10. The apparatus for multiplying based on Booth's algorithm according to claim 9, wherein said multiplier coefficients are indexed by said multiplier index in a lookup table, wherein said lookup table comprises the correspondent relations of a plurality of multiplier indexes and a plurality of multiplier coefficient sets.
 11. The apparatus for multiplying based on Booth's algorithm according to claim 9, wherein the sum of said partial products is a product of the multiplication of said determined multiplier and said multiplicand.
 12. The apparatus for multiplying based on Booth's algorithm according to claim 9, wherein said product is a set of binary bits and said output value is formed by partial bits of said product.
 13. The apparatus for multiplying based on Booth's algorithm according to claim 12, wherein each of said partial products is a set of binary bits comprising a high bit set and a low bit set, wherein the sum of said high bit set of said partial products is a high bit product and said product is the sum of said high bit product and a carry out value that is the rest bits except said low bit set in the sum of said low bit set of said partial products.
 14. The apparatus for multiplying based on Booth's algorithm according to claim 9, wherein said output value is formed by partial bits of the sum of said high bit product and said carry out value.
 15. The apparatus for multiplying based on Booth's algorithm according to claim 9, wherein said multiplier index is chosen from the floating point value and the fixed point value, and is identified by the following representations: binary, nibble, decimal and hexadecimal.
 16. The apparatus for multiplying based on Booth's algorithm according to claim 9, wherein said multiplicand is chosen from the floating point value and the fixed point value, and is identified by the following representations: binary, nibble, decimal and hexadecimal.
 17. The apparatus for multiplying based on Booth's algorithm according to claim 9, wherein said apparatus is applied in discrete cosine transform/inverse discrete cosine transform and said multiplier index is a cosine value.
 18. The apparatus for multiplying based on Booth's algorithm according to claim 17, wherein said cosine value is one of the following group comprising: $\frac{1}{2}\cos(\pi/4)$, $\frac{1}{2}\cos(\pi/8)$, $\frac{1}{2}\cos(3\pi/8)$, $\frac{1}{2}\cos(\pi/16)$, $\frac{1}{2}\cos(3\pi/16)$, $\frac{1}{2}\cos(7\pi/16)$ and $\frac{1}{2}\cos(5\pi/16)$.